Program description¹


Failure Free Reading is a language development program 
designed to improve vocabulary, fluency, word recognition, 
and reading comprehension for Kindergarten through grade 
12 students who score in the bottom 15% on standardized 
tests and who have not responded to conventional beginning 
reading instruction. The three key dimensions of the program
are repeated exposure to text, predictable sentence structures,
and story concepts that require minimal prior knowledge. The 
program combines systematic, scripted teacher instruction, 
talking software, workbook exercises, and independent reading 
activities. The program is delivered through small group or 
individual instruction. 


Research 


One study of Failure Free Reading met the What Works Clearinghouse
(WWC) evidence standards. This study included 93
students from third grade in Pennsylvania.²


The WWC considers the extent of evidence for Failure Free
Reading to be small for alphabetics, fluency, and comprehension.
No studies that met WWC evidence standards with or
without reservations addressed general reading achievement.



Effectiveness

Failure Free Reading was found to have no discernible effects on alphabetics and fluency, and potentially positive effects on
comprehension.









                           Alphabetics               Fluency                   Comprehension             General reading achievement
Rating of effectiveness    No discernible effects    No discernible effects    Potentially positive      na
Improvement index³         Average: +1 percentile    Average: +2 percentile    Average: +10 percentile   na
                           points                    points                    points
                           Range: -3 to +7                                     Range: +7 to +14
                           percentile points                                   percentile points




na = not applicable 



1. The descriptive information for this program was obtained from publicly available sources: the program’s web site (http://www.failurefreeonline.com/index_parents.php,
downloaded April 2007) and the research literature (Torgesen et al., 2006). The WWC requests developers to review the program
description sections for accuracy from their perspective. Further verification of the accuracy of the descriptive information for this program is beyond
the scope of this review.

2. The evidence presented in this report is based on available research. Findings and conclusions may change as new research becomes available. 

3. These numbers show the average and range of student-level improvement indices for all findings in the study. 
Developer and contact

Failure Free Reading is distributed by Failure Free Reading.
Address: 140 Cabarrus Ave. W., Concord, NC 28025.

Web: http://www.failurefreeonline.com/index_parents.php.
Telephone: (800) 542-2170.

Scope of use 

Failure Free Reading was founded in 1988 as JFL Enterprises, 
Inc. In 1996 it became Failure Free Reading and since then has 
been implemented in approximately 7,500 schools across the
United States. 

Teaching 

Failure Free Reading uses a model of repetition, text control,
and student performance feedback to scaffold fluency and
comprehension skills. Students read material designed to
be of interest at their grade/age level. Students learn to read
words, sentences, passages, and Lexile-leveled stories through
repeated presentations, listening, discussions, readings, and
reviews. Teachers monitor student progress with criterion-referenced
print and online assessments and reports. The program
is delivered through small group or individual instruction. The
level of instruction is determined by the students’ challenge or
frustration level, based on the assumption that repetition is not 
boring for struggling readers. Failure Free Reading also includes 
the Joseph Readers’ Talking Software that is “reading neutral,” 
meaning that students do not have to know how to read in order 
to learn critical words and passages. In this software, every item 
on the screen can be read aloud to the students. Verbal Master 
Software & Print is another available software, which aims to 
promote spelling, vocabulary, reading, and composition skills. 

Failure Free Reading provides product training and staff 
development. Training sessions address classroom manage- 
ment, education plans for students, parent involvement, teacher 
communications, and reporting. Follow-up visits and access to 
online technical and telephone support are included. Three-day 
intensive “train the trainer” sessions are available for district-level 
implementation. 

Cost 

Failure Free Reading costs from $300 for a single online student 
subscription to $37,500 for a full school implementation, based 
on multi-platform, networked software. Training costs range from 
$750 to $2,500, plus trainer expenses. 



Fifty-nine studies reviewed by the WWC investigated the effects 
of Failure Free Reading. One study (Torgesen et al., 2006) was a 
randomized controlled trial that met WWC evidence standards. 
The remaining 58 studies did not meet evidence screens. 

Torgesen et al. (2006) examined the effects of Failure Free 
Reading on 93 third-grade students in eight school units⁴ in
Pennsylvania. Students in the comparison group participated in 
the regular reading program at their schools. 



Extent of evidence 

The WWC categorizes the extent of evidence in each domain as
small or moderate to large (see the What Works Clearinghouse
Extent of Evidence Categorization Scheme). The extent of
evidence takes into account the number of studies and the
total sample size across the studies that met WWC evidence
standards with or without reservations.⁵



4. A school unit consists of several partnered schools so that the cluster included two third-grade and two fifth-grade instructional groups. Because of the 
age range defined by the Beginning Reading review, only data on the third-grade students were included in this review. 

5. The Extent of Evidence Categorization was developed to tell readers how much evidence was used to determine the intervention rating, focusing on the 
number and size of studies. Additional factors associated with a related concept, external validity, such as the students’ demographics and the types of 
settings in which studies took place, are not taken into account for the categorization. 
The WWC considers the extent of evidence for Failure Free
Reading to be small for alphabetics, fluency, and comprehension.
No studies that met WWC evidence standards with or without
reservations addressed general reading achievement.

Effectiveness

The WWC found Failure Free Reading to have no discernible effects on
alphabetics and fluency, and potentially positive effects on comprehension.



Findings 

The WWC review of interventions for beginning reading
addresses student outcomes in four domains: alphabetics,
fluency, comprehension, and general reading achievement.⁶
Torgesen et al. (2006) addressed three domains: alphabetics,
fluency, and comprehension.

Alphabetics. Torgesen et al. (2006) examined four phonics
outcomes in the alphabetics domain (Woodcock Reading
Mastery Test-Revised (WRMT-R): Word Identification and
Word Attack subtests and the Test of Word Reading Efficiency
(TOWRE): Phonetic Decoding Efficiency and Sight Word Efficiency
subtests). The authors reported that Failure Free Reading
did not have a statistically significant effect on any of the four
outcomes. The average effect size across the four outcomes
was neither statistically significant nor large enough to be considered
substantively important according to the WWC criteria
(that is, an effect size of at least 0.25).

Fluency. Torgesen et al. (2006) examined one outcome in
this domain (the Oral Reading Fluency test) and reported no
statistically significant effect for this outcome. The effect size was not
large enough to be considered substantively important.

Comprehension. Torgesen et al. (2006) examined two
outcomes in this domain (WRMT-R: Passage Comprehension
subtest and Group Reading Assessment and Diagnostic Evaluation
(GRADE): Passage Comprehension subtest) and reported no
statistically significant effects. The average effect size across the
two outcomes was large enough to be considered substantively
important.

Rating of effectiveness 

The WWC rates the effects of an intervention in a given outcome
domain as positive, potentially positive, mixed, no discernible
effects, potentially negative, or negative. The rating of effectiveness
takes into account four factors: the quality of the research
design, the statistical significance of the findings,⁷ the size of
the difference between participants in the intervention and the
comparison conditions, and the consistency in findings across
studies (see the WWC Intervention Rating Scheme).



Improvement index 

The WWC computes an improvement index for each individual
finding. In addition, within each outcome domain, the WWC
computes an average improvement index for each study and
an average improvement index across studies (see Technical
Details of WWC-Conducted Computations). The improvement
index represents the difference between the percentile rank
of the average student in the intervention condition versus
the percentile rank of the average student in the comparison
condition. Unlike the rating of effectiveness, the improvement
index is based entirely on the size of the effect, regardless of
the statistical significance of the effect, the study design, or the
analyses. The improvement index can take on values between
-50 and +50, with positive numbers denoting results favorable to
the intervention group.
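
As a rough illustration of how an improvement index relates to an effect size, the sketch below assumes the conversion described in the WWC technical documentation: the effect size is treated as a standard normal deviate, converted to a percentile rank in the comparison distribution, and 50 is subtracted. The function name `improvement_index` is hypothetical, and the inputs are the rounded effect sizes reported in this review for the comprehension and fluency domains.

```python
from statistics import NormalDist


def improvement_index(effect_size: float) -> float:
    """Percentile-point gain of the average intervention student.

    Assumes outcomes in the comparison group are normally distributed, so
    the average intervention student sits at percentile
    100 * Phi(effect_size) of the comparison distribution, while the
    average comparison student sits at the 50th percentile.
    """
    return 100.0 * NormalDist().cdf(effect_size) - 50.0


if __name__ == "__main__":
    # Rounded effect sizes reported in this review (illustrative inputs).
    reported = [
        ("GRADE: Passage Comprehension", 0.35),
        ("WRMT-R: Passage Comprehension", 0.18),
        ("Comprehension domain average", 0.26),
        ("Oral Reading Fluency", 0.05),
    ]
    for label, g in reported:
        print(f"{label}: {improvement_index(g):+.0f} percentile points")
```

Because the published effect sizes are rounded to two decimal places, an index recomputed from them can occasionally differ by one point from the value reported in the appendix tables.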

The average improvement index for alphabetics is +1 percentile
points across all findings in the single study, with a range
of -3 to +7 percentile points. The average improvement index
for fluency is +2 percentile points for the single outcome. The
average improvement index for comprehension is +10 percentile
points across all findings in the single study, with a range of +7 to
+14 percentile points.

6. For definitions of the domains, see the Beginning Reading Protocol.

7. The level of statistical significance was reported by the study authors or, where necessary, calculated by the WWC to correct for clustering within classrooms
or schools and for multiple comparisons. For an explanation, see the WWC Tutorial on Mismatch. See the Technical Details of WWC-Conducted
Computations for the formulas the WWC used to calculate the statistical significance. In the case of Failure Free Reading, corrections for multiple
comparisons were needed.
Appendix



Appendix A1 Study characteristics: Torgesen et al., 2006 (randomized controlled trial)


Characteristic 


Description 


Study citation 


Torgesen, J., Myers, D., Schirm, A., Stuart, E., Vartivarian, S., Mansfield, W., et al. (2006). National assessment of Title I interim report — Volume II: Closing the reading gap:
First year findings from a randomized trial of four reading interventions for striving readers. Retrieved from Institute of Education Sciences, U.S. Department of Education Web
site: http://www.ed.gov/rschstat/eval/disadv/title1interimreport/index.html


Participants 


The study design was based on random assignment of 37 school units¹ to one of four interventions: Corrective Reading, Kaplan SpellRead, Failure Free Reading, or Wilson
Reading. Within each school, students were randomly assigned to the intervention or to the comparison condition. This report focuses on eight school units assigned to Failure
Free Reading.² At the time of analysis, the sample included 93 third-grade students (55 in intervention and 38 in comparison groups). The number of students at baseline was
not reported.³ Students were eligible for participation in the study if they were identified as struggling readers by their teachers and if they scored at or below the 30th percentile
on a word-level reading test and at or above the 5th percentile on a vocabulary test. The intervention group had 24% African-American students and the comparison group
had 19%. The remaining students were Caucasian. Forty-five percent of the intervention group and 49% of the comparison group students were eligible for free/reduced
lunch.


Setting 


Eight school units in Pennsylvania. 


Intervention 


Failure Free Reading was implemented by 10 teachers. According to the study, almost all students in the intervention group received some of the treatment and a very large
percentage received 80 or more hours of instruction. The intervention was administered in three ways: large-group reading instruction was delivered by a general education 
teacher most of the week, pull-out instruction in groups of three students with mixed levels of basic reading skills occurred for about six hours a week, and one-on-one 
instruction was delivered by a reading specialist for less than one hour a week. Implementation fidelity was analyzed by reading program trainers who observed the teachers 
and coached them over several months, project coordinators who observed a sample of instructional sessions, and ratings based on a sample of videotaped sessions. 
Implementation was rated as acceptable. 


Comparison 


The comparison group students received their regular reading instruction, which included typical classroom instruction and, in many cases, other services (such as another 
pull-out program). The comparison group students had fewer small-group instructional hours than the intervention group students, but more one-on-one instructional hours. 


Primary outcomes 
and measurement 


The primary outcome measures in the alphabetics domain were the Word Identification and Word Attack subtests of the Woodcock Reading Mastery Tests-Revised
(WRMT-R) and the Phonemic Decoding Efficiency and the Sight Word Efficiency subtests of the Test of Word Reading Efficiency (TOWRE). The primary measure in the fluency
domain was the Oral Reading Fluency test. The primary measures in the comprehension domain were the Passage Comprehension subtest of WRMT-R and the Passage
Comprehension subtest of Group Reading Assessment and Diagnostic Evaluation (GRADE). (See Appendix A2.1-2.3 for more detailed descriptions of outcome measures.)


Teacher training 


Professional development included training and coaching by reading program staff, independent study of program materials, and telephone conferences. On average, intervention
group teachers participated in 70.8 professional development hours across all phases of the study (initial training phase, practice phase, and implementation phase).



1. A school unit consists of several partnered schools so that the cluster included two third-grade and two fifth-grade instructional groups. 

2. Findings on Corrective Reading, Kaplan SpellRead, and Wilson Reading are included in other WWC Beginning Reading reports. 

3. The study reported that two students in the intervention group and three students in the comparison group were lost to analysis. However, it is not clear if those students were in third grade or 
were part of an additional sample of fifth-grade students that was also examined in this study. The fifth-grade sample included in this study is not reviewed in this report because it is outside the 
scope of the review. For sample relevancy criteria, please see the Beginning Reading Protocol . 
Appendix A2.1 Outcome measures in the alphabetics domain



Outcome measure 


Description 


Test of Word Reading 
Efficiency (TOWRE): Phonetic 
Decoding Efficiency subtest 


The TOWRE is a standardized, nationally normed measure. The phonetic decoding efficiency subtest measures the number of pronounceable printed nonwords that can be
accurately decoded within 45 seconds (as cited in Torgesen et al., 2006).


TOWRE: Sight Word
Efficiency subtest 


The TOWRE is a standardized, nationally normed measure. The sight word efficiency subtest assesses the number of real printed words that can be accurately identified within 
45 seconds (as cited in Torgesen et al., 2006). 


Woodcock Reading Mastery 
Test-Revised (WRMT-R): 
Word Identification subtest 


The word identification subtest is a test of decoding skills. The standardized test requires the child to read aloud isolated real words that range in frequency and difficulty (as 
cited in Torgesen et al., 2006). 


WRMT-R: Word
Attack subtest 


This standardized test measures phonemic decoding skills by asking students to read pseudowords. Students are aware that the words are not real (as cited in Torgesen et al., 
2006). 



Appendix A2.2 Outcome measure in the fluency domain 



Outcome measure 


Description 


Edformation Oral 
Fluency Assessment 


This test measures the number of words correct per minute (WCPM) that students read using three brief grade-level passages (AIMSweb, as cited in Torgesen et al., 2006). 
These passages include both fiction and nonfiction text. The norms for this test are updated by Edformation each school year. 



Appendix A2.3 Outcome measures in the comprehension domain 



Outcome measure 


Description 


Group Reading Assessment 
and Diagnostic Evaluation 
(GRADE): Passage
Comprehension subtest 


The GRADE is an untimed norm-referenced standardized test. The passage comprehension subtest includes a passage of text and corresponding multiple-choice comprehen- 
sion questions (as cited in Torgesen et al., 2006). 


WRMT-R: Passage
Comprehension subtest 


In this standardized test, comprehension is measured by having students fill in missing words in a short paragraph (as cited in Torgesen et al., 2006). 
Appendix A3.1 Summary of study findings included in the rating for the alphabetics domain¹

                                                                      Authors’ findings from the study    WWC calculations
                                                                      Mean outcome (standard deviation²)
Outcome measure                              Study    Sample size     Failure Free      Comparison        Mean difference³       Effect  Statistical    Improvement
                                             sample   (school units/  Reading group     group             (Failure Free Reading  size⁴   significance⁵  index⁶
                                                      students)                                           - comparison)                  (at α = 0.05)

Torgesen et al., 2006 (randomized controlled trial)⁷
WRMT-R: Word Identification subtest          Grade 3  8/93            88.01 (15.00)     86.66 (15.00)     1.35                   0.09    ns             +4
WRMT-R: Word Attack subtest                  Grade 3  8/93            89.36 (15.00)     89.89 (15.00)     -0.53                  -0.04   ns             -1
TOWRE: Phonetic Decoding Efficiency subtest  Grade 3  8/93            87.05 (15.00)     88.36 (15.00)     -1.31                  -0.09   ns             -3
TOWRE: Sight Word Efficiency subtest         Grade 3  8/93            90.01 (15.00)     87.39 (15.00)     2.62                   0.17    ns             +7

Domain average⁸ for alphabetics                                                                                                  0.04    ns             +1


ns = not statistically significant 

1. This appendix reports findings considered for the effectiveness rating and the average improvement indices. The study also included subgroup analyses by initial skill level (WRMT-R: Word Attack subtest and Peabody Picture Vocabulary
Test (PPVT)) and socio-economic status. No differences were found between subgroups of students for outcomes in the alphabetics domain.

2. The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes. The standard 
deviations in Torgesen et al. (2006) were population standard deviations. 

3. Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group. The intervention group mean is the comparison group mean plus the mean difference. 

4. For an explanation of the effect size calculation, see Technical Details of WWC-Conducted Computations . 

5. Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups. 

6. The improvement index represents the difference between the percentile rank of the average student in the intervention condition versus the percentile rank of the average student in the comparison condition. The improvement index
can take on values between -50 and +50, with positive numbers denoting results favorable to the intervention group.

7. The level of statistical significance was reported by the study authors or, where necessary, calculated by the WWC to correct for clustering within classrooms or schools and for multiple comparisons. For an explanation about the
clustering correction, see the WWC Tutorial on Mismatch. See Technical Details of WWC-Conducted Computations for the formulas the WWC used to calculate statistical significance. In the case of Torgesen et al. (2006) and alphabetics,
no corrections for clustering were needed. Corrections for multiple comparisons were needed because the study’s reported corrections for multiple comparisons were based on a grouping of outcomes that differed from the groups
of domains for this review.

8. This row provides the study average, which, in this instance, is also the domain average. The WWC-computed domain average effect size is a simple average rounded to two decimal places. The domain improvement index is calculated 
from the average effect size. 
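
The effect sizes and domain average in this appendix can be approximately reproduced from the reported group means. The sketch below assumes a standardized mean difference with a small-sample correction (Hedges' g), the approach described in the WWC technical documentation, and the normal-distribution conversion for the improvement index. The function names and data layout are hypothetical, and using the analysis sample sizes of 55 and 38 students from Appendix A1 is an assumption made for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean


def hedges_g(m_t, m_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference with a small-sample correction."""
    pooled_sd = sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2)
                     / (n_t + n_c - 2))
    correction = 1 - 3 / (4 * (n_t + n_c) - 9)  # assumed correction factor
    return correction * (m_t - m_c) / pooled_sd


def improvement_index(g):
    """Percentile rank of the average intervention student minus 50."""
    return 100 * NormalDist().cdf(g) - 50


# (intervention mean, comparison mean) from the table; SD is 15.00 throughout.
ALPHABETICS = {
    "WRMT-R: Word Identification": (88.01, 86.66),
    "WRMT-R: Word Attack": (89.36, 89.89),
    "TOWRE: Phonetic Decoding Efficiency": (87.05, 88.36),
    "TOWRE: Sight Word Efficiency": (90.01, 87.39),
}
N_T, N_C = 55, 38  # group sizes reported in Appendix A1 (assumed to apply)

effect_sizes = {
    name: hedges_g(m_t, m_c, 15.0, 15.0, N_T, N_C)
    for name, (m_t, m_c) in ALPHABETICS.items()
}
# Simple average of the unrounded effect sizes across outcomes.
domain_average = mean(list(effect_sizes.values()))

if __name__ == "__main__":
    for name, g in effect_sizes.items():
        print(f"{name}: g = {g:.2f}, index = {improvement_index(g):+.0f}")
    print(f"Domain average: g = {domain_average:.2f}, "
          f"index = {improvement_index(domain_average):+.0f}")
```

Averaging the unrounded effect sizes before converting to an improvement index matters here: the domain average rounds to 0.04, but the index computed from the unrounded average is what matches the reported +1.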
Appendix A3.2 Summary of study findings included in the rating for the fluency domain¹

                                                                      Authors’ findings from the study    WWC calculations
                                                                      Mean outcome (standard deviation²)
Outcome measure                              Study    Sample size     Failure Free      Comparison        Mean difference⁴       Effect  Statistical    Improvement
                                             sample   (school units/  Reading group     group             (Failure Free Reading  size⁵   significance⁶  index⁷
                                                      students³)                                          - comparison)                  (at α = 0.05)

Torgesen et al., 2006 (randomized controlled trial)⁸
Oral Reading Fluency                         Grade 3  8/93            56.89 (39.20)     55.03 (39.20)     1.86                   0.05    ns             +2

Domain average⁹ for fluency                                                                                                      0.05    ns             +2




ns = not statistically significant 

1. This appendix reports findings considered for the effectiveness rating and the average improvement indices. The study also included subgroup analyses by initial skill level (WRMT-R: Word Attack subtest and Peabody Picture Vocabulary
Test (PPVT)) and socio-economic status. No differences were found between subgroups of students for outcomes in the fluency domain.

2. The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes. The standard 
deviations in Torgesen et al. (2006) are the population standard deviations for these standardized outcomes. 

3. The sample size for the analysis was not reported in Torgesen et al. (2006). The sample size reported is the total number of third-grade students in the intervention and control conditions at baseline, which may differ from the actual 
number of students used in the various analysis in the report. 

4. Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group. The intervention group mean is the comparison group mean plus the mean difference. 

5. For an explanation of the effect size calculation, see Technical Details of WWC-Conducted Computations . 

6. Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups. 

7. The improvement index represents the difference between the percentile rank of the average student in the intervention condition versus the percentile rank of the average student in the comparison condition. The improvement index
can take on values between -50 and +50, with positive numbers denoting results favorable to the intervention group.

8. The level of statistical significance was reported by the study authors or, where necessary, calculated by the WWC to correct for clustering within classrooms or schools and for multiple comparisons. For an explanation about the 
clustering correction, see the WWC Tutorial on Mismatch . See Technical Details of WWC-Conducted Computations for the formulas the WWC used to calculate statistical significance. In the case of Torgesen et al. (2006) and the fluency 
domain, no corrections for clustering were needed. No corrections for multiple comparisons were needed because there is only one outcome in this domain. 

9. This row provides the domain average, which, in this instance, is also the study finding for the single outcome. The WWC-computed domain average effect size is a simple average rounded to two decimal places. The domain improve- 
ment index is calculated from the average effect size. 
Appendix A3.3 Summary of study findings included in the rating for the comprehension domain¹

                                                                      Authors’ findings from the study    WWC calculations
                                                                      Mean outcome (standard deviation²)
Outcome measure                              Study    Sample size     Failure Free      Comparison        Mean difference⁴       Effect  Statistical    Improvement
                                             sample   (school units/  Reading group     group             (Failure Free Reading  size⁵   significance⁶  index⁷
                                                      students³)                                          - comparison)                  (at α = 0.05)

Torgesen et al., 2006 (randomized controlled trial)⁸
GRADE: Passage Comprehension subtest         Grade 3  8/93            83.71 (15.00)     78.43 (15.00)     5.28                   0.35    ns             +14
WRMT-R: Passage Comprehension subtest        Grade 3  8/93            90.38 (15.00)     87.65 (15.00)     2.73                   0.18    ns             +7

Domain average⁹ for comprehension                                                                                                0.26    ns             +10



ns = not statistically significant 

1. This appendix reports findings considered for the effectiveness rating and the average improvement indices. The study also included subgroup analyses by initial skill level (WRMT-R: Word Attack subtest and Peabody Picture Vocabulary
Test (PPVT)) and socio-economic status. No differences were found between subgroups of students for outcomes in the comprehension domain.

2. The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes. 

3. The sample size for the analysis was not reported in Torgesen et al. (2006). The sample size reported is the total number of third-grade students in the intervention and control conditions at baseline, which may differ from the actual 
number of students used in the various analysis in the report. 

4. Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group. The intervention group mean is the comparison group mean plus the mean difference. 

5. For an explanation of the effect size calculation, see Technical Details of WWC-Conducted Computations . 

6. Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups. 

7. The improvement index represents the difference between the percentile rank of the average student in the intervention condition versus the percentile rank of the average student in the comparison condition. The improvement index 
can take on values between -50 and +50, with positive numbers denoting results favorable to the intervention group. 

8. The level of statistical significance was reported by the study authors or, where necessary, calculated by the WWC to correct for clustering within classrooms or schools and for multiple comparisons. For an explanation about the
clustering correction, see the WWC Tutorial on Mismatch. See Technical Details of WWC-Conducted Computations for the formulas the WWC used to calculate statistical significance. In the case of Torgesen et al. (2006) and the
comprehension domain, no correction for clustering was needed. No corrections for multiple comparisons were needed because the study’s reported corrections for multiple comparisons were based on the same grouping of outcomes as the
domain for this review.

9. This row provides the domain average, which, in this instance, is also the study average. The WWC-computed domain average effect size is a simple average rounded to two decimal places. The domain improvement index is calculated 
from the average effect size. 
Appendix A4.1 Failure Free Reading rating for the alphabetics domain



The WWC rates an intervention’s effects in a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.¹

For the outcome domain of alphabetics, the WWC rated Failure Free Reading as having no discernible effects. It did not meet the criteria for other ratings (positive
effects, potentially positive effects, mixed effects, potentially negative effects, and negative effects) because the single study that met WWC standards did not show
statistically significant or substantively important effects.

Rating received

No discernible effects: No affirmative evidence of effects.

• Criterion 1: None of the studies shows a statistically significant or substantively important effect, either positive or negative.

Met. No studies showed statistically significant or substantively important positive or negative effects.

Other ratings considered

Positive effects: Strong evidence of a positive effect with no overriding contrary evidence.

• Criterion 1: Two or more studies showing statistically significant positive effects, at least one of which met WWC evidence standards for a strong design.

Not met. No studies showed statistically significant positive effects.

AND

• Criterion 2: No studies showing statistically significant or substantively important negative effects.

Met. No studies showed statistically significant or substantively important negative effects.

Potentially positive effects: Evidence of a positive effect with no overriding contrary evidence.

• Criterion 1: At least one study showing a statistically significant or substantively important positive effect.

Not met. No studies showed statistically significant or substantively important positive effects.

AND

• Criterion 2: No studies showing a statistically significant or substantively important negative effect and fewer or the same number of studies showing indeterminate
effects than showing statistically significant or substantively important positive effects.

Not met. The single study that met WWC standards showed indeterminate effects.

Mixed effects: Evidence of inconsistent effects as demonstrated through either of the following criteria. 

• Criterion 1: At least one study showing a statistically significant or substantively important positive effect, and at least one study showing a statistically significant 
or substantively important negative effect, but no more such studies than the number showing a statistically significant or substantively important positive effect. 

Not met. No studies showed statistically significant or substantively important effects, either positive or negative. 

OR 

• Criterion 2: At least one study showing a statistically significant or substantively important effect, and more studies showing an indeterminate effect than showing a 
statistically significant or substantively important effect. 

Not met. No studies showed statistically significant or substantively important effects, either positive or negative. 

Potentially negative effects: Evidence of a negative effect with no overriding contrary evidence. 

• Criterion 1: At least one study showing a statistically significant or substantively important negative effect.

Not met. No studies showed statistically significant or substantively important negative effects.

AND 

• Criterion 2: No studies showing a statistically significant or substantively important positive effect, or more studies showing statistically significant or substantively
important negative effects than showing statistically significant or substantively important positive effects.

Met. No studies showed statistically significant or substantively important positive effects. In addition, no studies showed a statistically significant 
or substantively important negative effect. 

Negative effects: Strong evidence of a negative effect with no overriding contrary evidence. 

• Criterion 1: Two or more studies showing statistically significant negative effects, at least one of which met WWC evidence standards for a strong design. 

Not met. No studies showed statistically significant negative effects. 

AND 

• Criterion 2: No studies showing statistically significant or substantively important positive effects. 

Met. No studies showed statistically significant or substantively important positive effects. 

1. For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of
potentially positive or potentially negative effects. See the WWC Intervention Rating Scheme for a complete description. 
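The criteria listed above for the six ratings can be read as a set of boolean checks over the studies in a domain. The following is a minimal sketch of that reading, not the WWC's own implementation: the `Study` class, its field names, and the `ratings_met` function are hypothetical, and each study is assumed to have already been classified as positive, negative, or indeterminate. The sketch reports every rating whose criteria are satisfied and does not assume any order of precedence among ratings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    # Hypothetical record for one study that met WWC evidence standards.
    # "positive" / "negative": a statistically significant or substantively
    # important effect in that direction; "indeterminate": neither.
    direction: str
    significant: bool = False    # effect is statistically significant
    strong_design: bool = False  # study met standards for a strong design


def ratings_met(studies):
    """Return the names of all ratings whose listed criteria are satisfied."""
    if not studies:
        return set()  # no studies in the domain: nothing to rate
    pos = [s for s in studies if s.direction == "positive"]
    neg = [s for s in studies if s.direction == "negative"]
    ind = [s for s in studies if s.direction == "indeterminate"]
    sig_pos = [s for s in pos if s.significant]
    sig_neg = [s for s in neg if s.significant]
    n_effect = len(pos) + len(neg)

    checks = {
        "positive": len(sig_pos) >= 2
        and any(s.strong_design for s in sig_pos)
        and not neg,
        "potentially positive": len(pos) >= 1
        and not neg
        and len(ind) <= len(pos),
        # Mixed effects: Criterion 1 OR Criterion 2.
        "mixed": (len(pos) >= 1 and 1 <= len(neg) <= len(pos))
        or (n_effect >= 1 and len(ind) > n_effect),
        "no discernible": not pos and not neg,
        "potentially negative": len(neg) >= 1
        and (not pos or len(neg) > len(pos)),
        "negative": len(sig_neg) >= 2
        and any(s.strong_design for s in sig_neg)
        and not pos,
    }
    return {name for name, ok in checks.items() if ok}


# One study with indeterminate effects (the alphabetics and fluency case).
print(ratings_met([Study("indeterminate")]))  # {'no discernible'}
# One study with a substantively important, non-significant positive effect.
print(ratings_met([Study("positive")]))       # {'potentially positive'}
```

Under these assumptions, a single indeterminate study satisfies only the no discernible effects criteria, and a single study with a substantively important positive effect satisfies only the potentially positive criteria.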
Appendix A4.2 Failure Free Reading rating for the fluency domain



The WWC rates an intervention’s effects in a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.¹

For the outcome domain of fluency, the WWC rated Failure Free Reading as having no discernible effects. It did not meet the criteria for other ratings (positive
effects, potentially positive effects, mixed effects, potentially negative effects, and negative effects) because the single study that met WWC standards did not show
statistically significant or substantively important effects.

Rating received 

No discernible effects: No affirmative evidence of effects. 

• Criterion 1: None of the studies shows a statistically significant or substantively important effect, either positive or negative.

Met. No studies showed statistically significant or substantively important positive or negative effects.

Other ratings considered 

Positive effects: Strong evidence of a positive effect with no overriding contrary evidence. 

• Criterion 1: Two or more studies showing statistically significant positive effects, at least one of which met WWC evidence standards for a strong design.

Not met. No studies showed statistically significant positive effects.

AND 

• Criterion 2: No studies showing statistically significant or substantively important negative effects.

Met. No studies showed statistically significant or substantively important negative effects.

Potentially positive effects: Evidence of a positive effect with no overriding contrary evidence. 

• Criterion 1: At least one study showing a statistically significant or substantively important positive effect.

Not met. No studies showed statistically significant or substantively important positive effects.

AND 

• Criterion 2: No studies showing a statistically significant or substantively important negative effect and fewer or the same number of studies showing indeterminate
effects than showing statistically significant or substantively important positive effects.

Not met. The single study that met WWC standards showed indeterminate effects. 

Mixed effects: Evidence of inconsistent effects as demonstrated through either of the following criteria. 

• Criterion 1: At least one study showing a statistically significant or substantively important positive effect, and at least one study showing a statistically significant 
or substantively important negative effect, but no more such studies than the number showing a statistically significant or substantively important positive effect. 

Not met. No studies showed statistically significant or substantively important effects, either positive or negative. 

OR 

• Criterion 2: At least one study showing a statistically significant or substantively important effect, and more studies showing an indeterminate effect than showing a 
statistically significant or substantively important effect. 

Not met. No studies showed statistically significant or substantively important effects, either positive or negative. 

Potentially negative effects: Evidence of a negative effect with no overriding contrary evidence. 

• Criterion 1: At least one study showing a statistically significant or substantively important negative effect.

Not met. No studies showed statistically significant or substantively important negative effects.

AND 

• Criterion 2: No studies showing a statistically significant or substantively important positive effect, or more studies showing statistically significant or substantively
important negative effects than showing statistically significant or substantively important positive effects.

Met. No studies showed statistically significant or substantively important positive effects. In addition, no studies showed a statistically significant 
or substantively important negative effect. 

Negative effects: Strong evidence of a negative effect with no overriding contrary evidence. 

• Criterion 1: Two or more studies showing statistically significant negative effects, at least one of which met WWC evidence standards for a strong design. 

Not met. No studies showed statistically significant negative effects. 

AND 

• Criterion 2: No studies showing statistically significant or substantively important positive effects. 

Met. No studies showed statistically significant or substantively important positive effects. 

1. For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of
potentially positive or potentially negative effects. See the WWC Intervention Rating Scheme for a complete description. 
Appendix A4.3 Failure Free Reading rating for the comprehension domain



The WWC rates an intervention’s effects in a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.¹

For the outcome domain of comprehension, the WWC rated Failure Free Reading as having potentially positive effects. It did not meet the criteria for positive effects
because it had only one study, and that study did not show statistically significant positive effects. The remaining ratings (mixed effects, no discernible effects,
potentially negative effects, and negative effects) were not considered because Failure Free Reading was assigned a higher applicable rating.



Rating received 

Potentially positive effects: Evidence of a positive effect with no overriding contrary evidence. 

• Criterion 1: At least one study showing a statistically significant or substantively important positive effect.

Met. One study showed a substantively important positive effect. 

AND 

• Criterion 2: No studies showing a statistically significant or substantively important negative effect and fewer or the same number of studies showing indeterminate
effects than showing statistically significant or substantively important positive effects.

Met. No studies showed a statistically significant or substantively important negative effect. The single study that met the WWC standards showed
a substantively important positive effect.

Other ratings considered 

Positive effects: Strong evidence of a positive effect with no overriding contrary evidence. 

• Criterion 1: Two or more studies showing statistically significant positive effects, at least one of which met WWC evidence standards for a strong design.

Not met. No studies showed statistically significant positive effects.

AND 

• Criterion 2: No studies showing statistically significant or substantively important negative effects.

Met. No studies showed statistically significant or substantively important negative effects. 

1. For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of 
potentially positive or potentially negative effects. See the WWC Intervention Rating Scheme for a complete description. 
Appendix A5 Extent of evidence by domain



Outcome domain               Number of studies   Sample size: school units   Sample size: students   Extent of evidence¹

Alphabetics                  1                   8                           93                      Small
Fluency                      1                   8                           93                      Small
Comprehension                1                   8                           93                      Small
General reading achievement  0                   0                           0                       na



na = not applicable/not studied 

1. A rating of “moderate to large” requires at least two studies and two schools across studies in one domain, and a total sample size across studies of at least 350 students or 14 classrooms.
Otherwise, the rating is “small.” 
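The rule in the footnote above can be written as a short function. This is an illustrative sketch only: the function name `extent_of_evidence` and its parameters are hypothetical, the classroom count is not reported in the table and is assumed to default to zero, and returning "na" for a domain with no studies is an assumption based on the general reading achievement row.

```python
def extent_of_evidence(n_studies, n_schools, n_students, n_classrooms=0):
    """Classify the extent of evidence for one outcome domain."""
    if n_studies == 0:
        return "na"  # assumption: a domain with no studies is not rated
    enough_studies = n_studies >= 2 and n_schools >= 2
    # Sample-size threshold: at least 350 students OR at least 14 classrooms.
    enough_sample = n_students >= 350 or n_classrooms >= 14
    if enough_studies and enough_sample:
        return "moderate to large"
    return "small"


# Rows of the table above: one study, 8 school units, 93 students.
print(extent_of_evidence(1, 8, 93))  # small
# General reading achievement: no studies.
print(extent_of_evidence(0, 0, 0))   # na
```

With one study and 93 students, each of the three studied domains fails both the study-count and the sample-size requirements, which matches the "small" rating shown in the table.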